Detailed protocol


As noted in the Materials and Methods section of the paper, we did the following: 

Rarefaction analysis

Rarefaction analysis was performed by randomly selecting subsets (without replacement) of between 1 and 627 (all), 232 (Cluster A) or 108 (Cluster B) mycobacteriophages and determining the number of phamilies represented. This was repeated 10,000 times to generate the mean number of phamilies observed for each number of phage genomes selected. The means of the accumulated numbers of phams and the numbers of new phages identified are plotted as a function of the number of genomes selected at random. The observed numbers were fit to a hyperbolic function for 50% of the sample (i.e., 1 to 314, 116 or 54 genomes for all, Cluster A or Cluster B phages, respectively); Hanes-Woolf regression was used to estimate Phammax and Km of the hyperbola:

Nphams = (Phammax * NGenomes) / (Km + NGenomes) 

where NGenomes is the number of genomes sampled, Nphams is the total number of phams seen within those genomes, Phammax is the total number of phams among all mycobacteriophage genomes, and Km is the number of genomes required to sample one half of Phammax. The lack of fit of the observed data to the hyperbola, with the observed data reflecting infinite size, suggests that the overall population is dynamic. The lack of hyperbolic fit does not result from outliers such as phages with highly deviant GC%, because removing these does not improve the fit. The fit is also not substantially improved by analysis of the two largest clusters, Cluster A and Cluster B (Figure 7), suggesting that the dynamic nature of the gene pool is not an artifact of examining independent phage clusters with separate gene pools. To model this behavior, we modified Equation 1 to include the introduction of novel phams via recombination with outside, non-mycobacteriophage genomes:

Nphams = NGenomes * Cphage + ((Phammax * NGenomes) / (Km + NGenomes))

where Cphage is the number of outside phams seen in each phage. The value of Cphage was estimated from Figure 7B, and new values for Phammax and Km were estimated by Hanes-Woolf regression following data normalization.
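
The procedure above can be illustrated with a minimal Python sketch. It is not the code used for the paper, and every function name, parameter and toy value in it is hypothetical. Two assumptions are worth flagging. First, for efficiency the sketch shuffles the genomes once per replicate and reads off the prefix of each length, so that one replicate yields a random subset (without replacement) of every size; the protocol does not say whether subsets of different sizes were drawn independently. Second, "data normalization" is read here as subtracting the linear term NGenomes * Cphage before refitting, which the text does not spell out. The Hanes-Woolf step uses the standard linearization x/y = x/Phammax + Km/Phammax.

```python
import random


def rarefaction_means(genome_phams, n_reps=1000, seed=0):
    """Mean number of distinct phams seen in random subsets of each size.

    genome_phams: list of sets, one set of pham identifiers per genome.
    Returns a list whose entry k-1 is the mean pham count for subsets of size k.
    """
    rng = random.Random(seed)
    n = len(genome_phams)
    totals = [0] * n
    for _ in range(n_reps):
        # One random ordering; its first k genomes form a random k-subset.
        order = rng.sample(genome_phams, n)
        seen = set()
        for k, phams in enumerate(order):
            seen |= phams
            totals[k] += len(seen)
    return [t / n_reps for t in totals]


def hanes_woolf_fit(n_genomes, n_phams):
    """Estimate (pham_max, k_m) by least squares on x/y versus x."""
    xs = [float(x) for x in n_genomes]
    ys = [x / y for x, y in zip(xs, n_phams)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals 1 / pham_max
    intercept = mean_y - slope * mean_x  # equals k_m / pham_max
    return 1.0 / slope, intercept / slope


def hyperbola(n, pham_max, k_m):
    """Equation 1: saturating accumulation of phams."""
    return pham_max * n / (k_m + n)


def hyperbola_with_influx(n, pham_max, k_m, c_phage):
    """Modified model: Equation 1 plus a linear influx of outside phams."""
    return n * c_phage + hyperbola(n, pham_max, k_m)


if __name__ == "__main__":
    toy_rng = random.Random(1)
    # Toy data (hypothetical): 40 genomes, each with 20 phams from a pool of 300.
    genomes = [set(toy_rng.sample(range(300), 20)) for _ in range(40)]
    means = rarefaction_means(genomes, n_reps=200)

    half = len(means) // 2  # fit on the first 50% of the sample, as in the text
    sizes = list(range(1, half + 1))
    print(hanes_woolf_fit(sizes, means[:half]))

    # Assumed reading of "data normalization": remove the linear term first.
    c_phage = 0.5  # placeholder value, not taken from the paper
    normalized = [m - n * c_phage for n, m in zip(sizes, means[:half])]
    print(hanes_woolf_fit(sizes, normalized))
```

With real data, genome_phams would hold one set of pham identifiers per mycobacteriophage genome, and c_phage would be replaced by the value read from Figure 7B.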